Support caching java dependencies for maven projects #664

Open
pierDipi opened this issue Sep 25, 2024 · 10 comments

@pierDipi

pierDipi commented Sep 25, 2024

I'm trying to have hermetic builds for a Java project using Maven in Konflux. Since cachi2 doesn't support caching Maven dependencies, the only reasonable way I found was to create an intermediate image that downloads the dependencies [1]. However, that doesn't pass the default enterprise contract, because using intermediate images causes the error: Base image "xyz" is from a disallowed registry.

Here is the PR with proof of concept using the intermediate image openshift-knative/eventing-kafka-broker#1273

[1] https://github.com/openshift-knative/eventing-kafka-broker/blob/e1355b833093404b5e5e13f5a7bba1fc241cf49c/openshift/ci-operator/static-images/dispatcher/konflux/Dockerfile.deps


Potential Solution

To cache Maven dependencies in a way that also supports multi-module projects, we need to:

  • Copy all pom.xml files into a temporary directory, preserving the file system structure
  • Run mvn package dependency:go-offline <user provided flags> -Dmaven.repo.local=<cachi2-maven-deps-directory>
  • Remove the remote metadata: find <cachi2-maven-deps-directory> -path "*_remote.repositories" | xargs -I{} rm {}
@pierDipi
Author

pierDipi commented Sep 25, 2024

I see a different approach in #663 that would work for Gradle projects too.

@kosciCZ
Contributor

kosciCZ commented Sep 26, 2024

Hey @pierDipi, how well would you say #663 covers your use case? Is there anything that could be changed to make it work better for your use case (e.g. not to be too much hassle to set up when fetching entire projects)?

(disclaimer: not a cachi2 maintainer, just trying to see if #663 could be improved upon)

@pierDipi
Author

pierDipi commented Sep 26, 2024

My only problem with it is that, as it stands, I don't know whether we have existing tooling, or will provide any, to create that custom "lock file".

The project we're trying to build has around 230 dependencies, including transitive ones, and it's a relatively small-to-medium-sized project. Without a companion tool to help with that, creating and maintaining that lock file over time becomes very tedious and, at the same time, I wouldn't want every team to create its own bespoke tool.

@kosciCZ
Contributor

kosciCZ commented Sep 26, 2024

> My only problem with it is that, as it stands, I don't know whether we have existing tooling, or will provide any, to create that custom "lock file".

The short answer is no, not currently, and most likely not in the future.

> The project we're trying to build has around 230 dependencies, including transitive ones, and it's a relatively small-to-medium-sized project. Without a companion tool to help with that, creating and maintaining that lock file over time becomes very tedious and, at the same time, I wouldn't want every team to create its own bespoke tool.

I definitely agree with this sentiment.

While I can see how #663 could be used for your use case, I think it is not a 100% match. The feature, as I understand it, is more for fetching one-off artifacts from Maven, when fetching an entire build (or all its dependencies) is inefficient or costly.

Again, I'm not a cachi2 maintainer or a Java expert in any way, but I can see your use case fitting a separate cachi2 package manager, if that's a typical way for a Java project to be structured and set up for a hermetic build.

@aloubyansky

For Java, there is no reliable alternative to capturing all the necessary dependencies for a build besides running the build. So there is that as the first step in prefetching.

We could generate a lockfile from the content of a Maven repo on disk, except that in some edge cases it will be challenging to determine the proper values of classifiers, versions and types based on filenames.
An alternative approach could be to create a Maven extension that registers a repository listener to listen to artifact resolution events and record each artifact resolved. However, there could be plugins that initialize their own resolvers and register their own repository listeners. An extra check may need to be done to make sure all the artifacts in a local repo have been properly captured in the lockfile.
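
As a rough illustration of the on-disk approach and the extra check, here is a minimal sketch. It assumes the standard local repository layout (group path, then artifactId, then version directory) and only looks at .jar and .pom files. The function names and the plain-text "captured" list (one groupId:artifactId:version:fileName entry per line) are hypothetical; they are not the lockfile format proposed in #663.

```shell
# Print "groupId:artifactId:version:fileName" for every .jar and .pom file
# under a Maven local repository. groupId, artifactId and version are derived
# from the directory layout only; the file name is kept whole because
# splitting it into classifier and type is where the edge cases live.
list_repo_artifacts() {
  local repo_dir="${1%/}"
  local file rel version_dir artifact_dir group_dir
  find "${repo_dir}" -type f \( -name '*.jar' -o -name '*.pom' \) \
    | LC_ALL=C sort \
    | while IFS= read -r file; do
        rel="${file#"${repo_dir}"/}"
        version_dir="$(dirname "${rel}")"
        artifact_dir="$(dirname "${version_dir}")"
        group_dir="$(dirname "${artifact_dir}")"
        printf '%s:%s:%s:%s\n' \
          "${group_dir//\//.}" \
          "$(basename "${artifact_dir}")" \
          "$(basename "${version_dir}")" \
          "$(basename "${file}")"
      done
}

# Print repository entries that are missing from a list of captured entries.
# The list format is a hypothetical stand-in for a real lockfile.
report_uncaptured() {
  local repo_dir="$1" captured_list="$2"
  # grep exits non-zero when nothing is missing; that is not an error here.
  list_repo_artifacts "${repo_dir}" | grep -Fxv -f "${captured_list}" || true
}
```

For example, report_uncaptured third_party/maven captured.txt would print every artifact file in the local repo that the hypothetical captured.txt does not mention. An empty output would suggest the listener recorded everything.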

@pierDipi
Author

> For Java, there is no reliable alternative to capturing all the necessary dependencies for a build besides running the build. So there is that as the first step in prefetching.

I may be missing something, or the logic may be flawed in some cases, but in the POC (openshift-knative/eventing-kafka-broker#1273) I don't think the actual build is running when caching dependencies since there is no project code to build at that stage.

I had to add all the modules to the Maven reactor via package, but without the source code it doesn't build the project.

At a high level, what I did was:

Ensure maven-dependency-plugin is at version 3.8+

    <pluginManagement>
      <plugins>
        <plugin>
          <artifactId>maven-dependency-plugin</artifactId>
          <version>3.8.0</version>
        </plugin>
      </plugins>
    </pluginManagement>

Cache dependencies (generic step)

#!/usr/bin/env bash

set -euo pipefail

project_dir="$(pwd)"
cachi2_temp_build="$(mktemp -d)"

# Remove the temporary directory when the script exits
trap 'rm -rf "${cachi2_temp_build}"' EXIT

# Copy pom.xml files to the `cachi2_temp_build` directory, preserving the directory structure.
# The destination is passed as an argument so that paths with spaces are handled safely.
find . -name pom.xml -exec bash -c '
  dest="$1"; shift
  for pom in "$@"; do
    mkdir -p "${dest}/$(dirname "${pom}")"
    cp "${pom}" "${dest}/${pom}"
  done
' shell "${cachi2_temp_build}" {} +

cd "${cachi2_temp_build}"

# Note: at this point in ${cachi2_temp_build} there are only pom.xml files, there isn't the project code to build.

# Add all the modules to Maven reactor with package, this doesn't build the modules (since there is no code to build).
# Dependencies are saved to ${cachi2_temp_build}/third_party/maven
mvn package dependency:go-offline -Dmaven.repo.local="${cachi2_temp_build}/third_party/maven"

# Remove the remote metadata (_remote.repositories files) from the saved dependencies
find "${cachi2_temp_build}/third_party/maven" -path "*_remote.repositories" -exec rm {} +

Build the JARs in offline mode using the local dependencies (project-specific step)

cd "${project_dir}"
# Build the app
mvn -Dmaven.repo.local="${cachi2_temp_build}/third_party/maven" --offline package -pl=dispatcher-loom -am -DskipTests

@aloubyansky

> I don't think the actual build is running when caching dependencies since there is no project code to build at that stage.

Do you create a POM-only module layout w/o adding the source code and run mvn package on it?

dependency:go-offline should be redundant when package is used, since it will run all the phases that require dependency resolution.
What it will be missing is the dependencies of plugins that run after the package phase, such as integration testing (failsafe), install, deploy, etc.

@aloubyansky

> What it will be missing is the dependencies of plugins that run after the package phase, such as integration testing (failsafe), install, deploy, etc.

Actually, that's wrong: dependency:go-offline does download the plugin dependencies.

@pierDipi
Author

pierDipi commented Oct 1, 2024

> Do you create a POM-only module layout w/o adding the source code and run mvn package on it?

Yes: mvn package dependency:go-offline -Dmaven.repo.local="${cachi2_temp_build}/third_party/maven"

@lkolacek
Contributor

Hi guys, thank you for opening this issue and for the ongoing discussion. We don't plan to implement Java support in Cachi2, but of course, you're more than welcome to contribute it to our project. We would very much appreciate it, and we'll try to support you with reviews.
