-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MRESOLVER-247: Introduce skipper to avoid unnecessary dependency reso… #158
Conversation
...er-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipper.java
Outdated
Show resolved
Hide resolved
...-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollector.java
Show resolved
Hide resolved
...er-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipper.java
Outdated
Show resolved
Hide resolved
...er-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipper.java
Outdated
Show resolved
Hide resolved
...er-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipper.java
Outdated
Show resolved
Hide resolved
...mpl/src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipperTest.java
Outdated
Show resolved
Hide resolved
...mpl/src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipperTest.java
Outdated
Show resolved
Hide resolved
...mpl/src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipperTest.java
Outdated
Show resolved
Hide resolved
...mpl/src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipperTest.java
Outdated
Show resolved
Hide resolved
...mpl/src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolveSkipperTest.java
Outdated
Show resolved
Hide resolved
...-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollector.java
Outdated
Show resolved
Hide resolved
...l/src/test/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollectorTest.java
Show resolved
Hide resolved
...l/src/test/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollectorTest.java
Outdated
Show resolved
Hide resolved
Please also help check the PR to see if it is good for you to apply your changes in MRESOLVER-7. |
b75db9a
to
97cea5d
Compare
Thanks for the code review. Fixed already. Please let me know if anything else I could help. |
.../src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolutionSkipperTest.java
Show resolved
Hide resolved
.../src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolutionSkipperTest.java
Show resolved
Hide resolved
.../src/test/java/org/eclipse/aether/internal/impl/collect/DependencyResolutionSkipperTest.java
Show resolved
Hide resolved
I will leave @ibabiankou at least a week to test and respond. |
There is a regression. Have a look at: mng-5669.zip |
Actually this is expected behavior. Skipper would simply skip resolving the conflict losers which will make maven read/parse less poms. Regarding the source poms as below, they are actually read only once. This means function in MNG-5669 does not break.
|
(all 3 building current mvn master 97c1e4b4aa73f5851801fdf5e17f013dd46a6b3e w/ empty local repo, primed proxy, no tests executed) |
Legend:
|
The one failing IT fix: Reason is simple: this IT "counts" ALL the resolution that maven (erm, resolver) does, and fails due substantial difference between "old" resolver (DFS and w/o skipper) and this PR (BFS + skipper). The PR above changes the Maven IT to not "lock down" the read POM count, but still assert that POMs are read only once. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
One remark: maybe a config param to DISABLE skipper? As sometimes inspecting the "whole overloaded" graph is useful, no? And skipper is added always currently. So maybe an another skipper implementation that is "no-skip"? |
This is actually fantastic since we save time when we skip. @jebeaudet Can you test this at work to speed up your build as well? |
This makes sense. @caiwei-ebay Do you see a way to make this configurable? |
...-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollector.java
Outdated
Show resolved
Hide resolved
This look very promising. Please document this parameter in the documentation page. |
I'll leave Ivan one more week to respond since he's likely having a hard time. |
Please check ada9970 |
...-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollector.java
Outdated
Show resolved
Hide resolved
...-impl/src/main/java/org/eclipse/aether/internal/impl/collect/DefaultDependencyCollector.java
Outdated
Show resolved
Hide resolved
Project: maven master 31193cbf0c93205a63c8c7b372b09200f60e69f4 mvn 3.8.5 (resolver 1.6.x + wagon, mvn 3.8.5 is on path): mvn this PR (resolver master w/ bfs + transport-http + skipper): mvn this PR (resolver master w/ bfs + transport-http + skipper) + SKIPPER DISABLED: |
Am getting consistently slower results (2nd, 3rd run), but unsure is it skipper or BFS that is on master? @caiwei-ebay could you tell more about those projects you refer to? I guess they are not open (if they are, please point us on them), but I wonder what they have in common: size (as in module count/reactor size)? are they using version ranges? Are they having unused dependencies (as you refer to "make dep tree clean, remove unused dependencies")...? |
Thanks for the test report. I'm not sure if I can find a opensource project that can reproduce such issue, even entire spring-boot project can print dependency:tree in 20s and does not have such issue. Or how about creating a mock app to demo the slowness case? Regarding the test method, could you please build mvn bfs with code checked out before the "BFS approach" commit and run the performance test? This can help narrow down the root causes, either because of the BFS commit or commits before BFS. This means the 1st case would be:
|
Project: Maven master 31193cbf0c93205a63c8c7b372b09200f60e69f4 Contenders:
3 runs as before: 1st is with empty local repository, going against central directly. mvn-385: mvn-df: mvn-bf: mvn-bf w/ skipper disabled: |
@caiwei-ebay @michael-o @ibabiankou @jebeaudet IMO, this PR is good to be merged, so let's merge it. I personally am not quite convinced about this (BFS + skipper), I'd need more tests, and would like to see how these "scale" as well (test on project with 100, 200, 400 modules and so on). So, once merged, I'd like to implement following changes:
I am aware there is still one incoming change (parallel POM download), and BFS is needed for it, but once it lands as well, we can prove (or prove the opposite) and just drop the simple "indirection". But while not there, having both in maven will greatly simplify testing, and give possibility for users to change collection (even if we leave it in resolver upon release). For start, I plan just to "copy paste" the two collector components (one from here and one from master) that will most probably introduce quite some code duplication, but for now is fine. IF we decide to drop one, OR to keep both, we can clean up the duplication by refactoring (or just by dropping the one not needed anymore). |
I fantastically support this. I'd like to hear @ibabiankou'Ss opinion as well. |
Actually Skipper does not help with multi module project. Maven builds module one by one, this means Maven will call collectDependencies one by one for each module, and thus the cache (Args.pool, not a static one) won't be reused among the resolution of multiple modules. This means if we can reuse Args.pool (make map static & concurrent safe here), it should be faster when building multiple modules. I would like to try that in near future, maybe after @ibabiankou's PR. Skipper only helps avoid unnecessary duplicate node resolution, it does least help for maven projects developed by maven experts like Spring as they have managed dependency versions very well and exclusions are rarely used. To print dependency:tree of entire spring-boot project (with large number of modules), it would take just 20s with my windows desktop. To be frank, it is not a common case in Opensource or people using Opensource project, this is why I say this skipper solution is only helpful for large projects with complex dependencies (micro service tech is not employed, people rely on libraries developed by other team to call APIs provided by them). The tests here actually demonstrate the case: same node with different exclusions, the 2nd one will be skipped as it is the loser node. |
... or make it a component. Will discuss this with @gnodet as well, as mvnd keeps contaner (hence components) for longer, while mvn CLI only during run... |
Created #160 |
RU sure about this? As pool is created out of session ( EDIT: I missed your point, member DataPool.nodes is RE-CREATED. |
@cstamas Another thing we could do later is to reverse versions (already reversed in the skipper PR) and add a break once at least one version has been resolved to speed up version range resolution which is for MRESOLVER-133. code here |
@michael-o @caiwei-ebay @ibabiankou where are we with this PR? IMO, it should be safe to merge, especially as I mentioned, I plan to retain "old" (DFS) resolver as well, the separation works well (I will update or redo my PR as merge of this one with conflict). |
I tested that locally, and it miserably failed (resolver build passed, but building anything with mvn using altered resolver failed, usually with some dep not on classpath). To me it seems re-using nodes cache is a no-go, still unsure why.... My initial guess: due post-processing (like scope etc). So maybe we need to cache "raw" graph instead (but that would require a LOT of heap). Also, will look into at least carry-on caching of resolved versions, as we do that as well over and over again.... |
Will get back to this end of week. |
I'm totally fine. Please go ahead with your proposals. Thanks! |
@michael-o @cstamas @jebeaudet @ibabiankou mvn-df: ~/bin/apache-maven/maven-master-df/bin/mvn -Dmaven.repo.local=local-repo clean package -Drat.skip -Dtest=et -DfailIfNoTests=false mvn-bf: ~/bin/apache-maven/maven-master-bf/bin/mvn -Dmaven.repo.local=local-repo clean package -Drat.skip -Dtest=et -DfailIfNoTests=false Below is the commit including some minor changes to reduce object creations and array copies which might be helpful to fix the slight performance downgrade: As I don't have your test project, I was verifying it by running dependency tree of spring-boot project (mvn dependency:tree -o) and saw a bit improvement (17.7s -> 17.1s). |
@michael-o @cstamas @jebeaudet |
I will try this tomorrow. Unfortunately, @ibabiankou is out of reach. If I don't hear any objections, I am going to merge this this weekend then @cstamas can work on top to create a switch between DFS and BFS. |
Testing now... |
Applied to Maven master:
|
Applied to the same project:
|
Measured with:
Maven with three modified Resolvers:
CPU:
There is obviously a slight improvement, but not in my setup. I agree with @cstamas that DFS should stay default, both switchable until we get a clearer picture. |
@michael-o @cstamas Not sure if @ibabiankou has time to continue on the parallel downloading patch based on BFS approach. @ibabiankou Would you mind if I implement it for you? |
@caiwei-ebay Can you first please pick up this PR #160 to make it coexist? I will merge this is a few moments. |
@michael-o @cstamas @weibo1995 @ibabiankou Echoed weibo1995's feedback in #172
Really glad that this patch could help more developers. Also I'm currently working on parallel pom downloading, the idea is a bit different than https://github.com/ibabiankou/maven-resolver/pull/2/files:
Here is the commit for preview and the code is still evolving, I will certify the approach by dry-running large numbers of apps in our company, please share your comments with the idea. Note: I'm resetting the 2FA auth of original "caiwei-ebay" account and unable to fire PR with "caiwei-ebay2" (getting message saying: It looks like this is your first time opening a pull request in this project!). Will fire draft PR once 2FA auth is reset and we can discuss there later. |
Parallel POM downloading in DF collector. - only parallelize the VersionRange and ArtifactDesciptor resolution part (poms.xml & maven-metadata.xml downloading in parallel) as the poms downloading or maven-metadata.xml is the most slow part. - Pure dependency resolution logic and the [skip logic](#158) are still run in sequence. --- https://issues.apache.org/jira/browse/MRESOLVER-7
Please check implementation details in https://issues.apache.org/jira/browse/MRESOLVER-247.
As discussed in MRESOLVER-228, here is the plan for multiple changes requested recently:
Implementation
This PR is for skip approach. It implements a skipper based on BFS to skip unnecessary dependency resolution for below cases:
The skip logic resides with DependencyResolutionSkipper's skipResolution, here is the flow:
Each node has 4 states after calling skipResolution:
Force resolving partial duplicate nodes
Why?
This is because Maven picks up widest scope present among conflicting dependencies (scopes of the conflicts may differ), we need to retain the conflict paths for scope selection. Please check ConflictIdSorter and JavaScopeSelector for more details.
How?
Each node in the tree will have a coordinate in form of (depth, sequence of given depth), ex (2, 3) represents the 3rd node in depth 2.
Given multiple R nodes (same GAV) in the tree,
R#1 is resolved, R#1 is the winner as it is the first R node resolved in BFS approach (already the R node nearest to root)
If R#2 is more left than R#1 (current leftmost R node), force resolve R#2, mark R#2 is the leftmost node of R artifact
If R#3 is more left than R#2 (current leftmost R node), force resolve R#3, mark R#3 is the leftmost node of R artifact
If R#4 is not left than R#3 (current leftmost R node), skip R#4
If R#5 is not left than R#3 (current leftmost R node), skip R#5