Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend the packages endpoint with the shortest dependency path #1882

Conversation

lamppu
Copy link
Contributor

@lamppu lamppu commented Jan 27, 2025

Add information about shortest dependency paths to the packages endpoint response. This will give a quick overview for how deeply nested a dependency is in the project. For simplicity, only one path for a package for each project is returned (only the shortest path across scopes is picked).

Please see the individual commits for details.

Screenshot from 2025-02-17 15-10-53

@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch 2 times, most recently from 8075d4e to 3ab2801 Compare January 28, 2025 14:07
@lamppu lamppu marked this pull request as ready for review January 28, 2025 14:34
@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch from 3ab2801 to 95a3c45 Compare January 29, 2025 08:33
@lamppu lamppu requested a review from sschuberth January 29, 2025 08:37
@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch 2 times, most recently from 4f3a8f8 to 80e9b83 Compare January 30, 2025 07:34
Copy link
Contributor

@sschuberth sschuberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically LGTM, but I'd also like @mnonnenmacher to have a look.

@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch from 80e9b83 to cb89d6a Compare January 30, 2025 11:40
@lamppu lamppu requested a review from sschuberth January 30, 2025 11:48
@sschuberth

This comment was marked as resolved.

@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch from cb89d6a to 6aa15d5 Compare January 30, 2025 13:23
@lamppu

This comment was marked as outdated.

@lamppu lamppu marked this pull request as draft January 31, 2025 12:39
@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch 5 times, most recently from 15e5014 to be3c51b Compare February 17, 2025 10:16
@lamppu lamppu marked this pull request as ready for review February 17, 2025 10:39
@lamppu lamppu requested a review from mnonnenmacher February 17, 2025 10:39
@sschuberth
Copy link
Contributor

@lamppu, for a quick overview of the feature, could you maybe add a screenshot to the top post?

@Etsija
Copy link
Contributor

Etsija commented Feb 17, 2025

I'm already preparing the UI for this, so will then add a relevant screenshot to my PR as well. As with this feature, the packages table starts to be something useful to users, I'm also adding sorting and filtering to the table.

@sschuberth
Copy link
Contributor

@lamppu, for a quick overview of the feature, could you maybe add a screenshot to the top post?

My bad, I totally forgot that the front-end UI is not implemented yet... that's what I was referring to as a screenshot 😬 No need to add screenshots for code 😅

@lamppu
Copy link
Contributor Author

lamppu commented Feb 17, 2025

I'm also adding sorting and filtering to the table.

I think this is pending backend sorting and filtering, because the API request will be very slow if it's done client side.

@Etsija
Copy link
Contributor

Etsija commented Feb 17, 2025

I'm also adding sorting and filtering to the table.

I think this is pending backend sorting and filtering, because the API request will be very slow if it's done client side.

You're right. Yesterday I thought the difference is negligible, but today I realised that was because I already had the query results in cache for most of my testing. Starting from scratch today showed me that for a case, where with back-end manipulation it took a second to fetch the packages, client-side manipulation exploded the delay to over 5-fold. I'll scratch the client-side manipulation from my UI implementation. But, then we actually need to add sorting and filtering to the API request pretty soon, for that packages table to have any real use, because no one wants to go through a table with thousands of packets page by page.

@sschuberth the picture Johanna provided actually helps me... 😏

Copy link
Contributor

@sschuberth sschuberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits. @mnonnenmacher please also have another look.

/** The shortest dependency path for a Package. */
data class ShortestDependencyPath(
/** The identifier of the root project of this path. */
val projectIdentifier: Identifier,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe just projectId?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since at least the models Project, VulnerabilityWithIdentifier and Package talk about identifiers, I would like to just ensure some consensus on whether and when to shorten it to id before making this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnonnenmacher, what's the guideline here?

Copy link
Contributor

@mnonnenmacher mnonnenmacher Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it makes sense, we use Identifier instead of Id also in other places like the Package class to make it less ambiguous, but this is not done consistently.

Comment on lines +89 to +96
ShortestDependencyPathDao.new {
this.pkg = pkgDao
this.analyzerRun = analyzerRun
this.project = projectDao
this.scope = shortestDependencyPath.scope
this.path = shortestDependencyPath.path
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double-checking: This creates a new ShortestDependencyPathDao, but does not seem to store it anywhere. Is this really correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does create a new table row 1 2, note also that the test in DaoAnalyzerRunRepositoryTest.kt would fail if it didn't, as well as the tests in the following commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct, the new function is running the INSERT statement.

val pkg: Package,

/** Package ID */
val pkgId: Long,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the same as a Package's Identifier, so would it make sense to use a different property name to avoid confusion (but I do not have a good idea for any)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this could be something like pkgDbId to make it explicitly clear that it refers to the database ID (or at least the comment should be better)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also pinging @mnonnenmacher for this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I would recommend to use id for database ids and identifier for ORT identifiers.

@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch 4 times, most recently from 9a1d005 to 94a38ae Compare February 18, 2025 13:04
@lamppu lamppu requested a review from sschuberth February 18, 2025 13:24
sschuberth
sschuberth previously approved these changes Feb 19, 2025
Comment on lines +89 to +96
ShortestDependencyPathDao.new {
this.pkg = pkgDao
this.analyzerRun = analyzerRun
this.project = projectDao
this.scope = shortestDependencyPath.scope
this.path = shortestDependencyPath.path
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct, the new function is running the INSERT statement.


analyzerRunRepository.create(analyzerJobId, analyzerRun, shortestPaths)

dbExtension.db.dbQuery { ShortestDependencyPathsTable.selectAll().count() } shouldBe 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to also verify that the correct data was stored.

import org.eclipse.apoapsis.ortserver.model.runs.Issue
import org.eclipse.apoapsis.ortserver.model.runs.Package
import org.eclipse.apoapsis.ortserver.model.runs.ProcessedDeclaredLicense
import org.eclipse.apoapsis.ortserver.model.runs.Project
import org.eclipse.apoapsis.ortserver.model.runs.ShortestDependencyPath
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message:

only save the shortest of the shortest paths across all scopes

That confuses me, if just "only save the shortest path across all scopes" is correct I would find it easier to understand.


if (existingEntry == null || existingEntry.path.size > pathList.size) {
map[pkgIdentifier.mapToModel()] =
ShortestDependencyPath(projectIdentifier, scope, pathList.map { it.mapToModel() })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a conceptual question: If two scopes have an equally short path, the chosen scope is more or less random. If we show this to users, this can be confusing, for example, if the runtime and the test scopes have an equally short path, and test is chosen here, in the UI it will appear as if this was a test dependency. Should we additionally store all scopes that contain a package so that we can show this as well, to avoid that fool users into thinking some dependencies are test dependencies when they are not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for the UI, I'd like to avoid displaying multiple / all scope with shortest paths (to not overwhelm the user, and to save screen space). How about if we instead name the scope if it's only one, and say "in multiple scopes" if it's more than one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would just not mention the scope at all in this case, does it matter which scope contains the shortest path?
But as a separate feature I would like to store all scope names that a package appears in, because this could also be used to filter packages by scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we show this to users, this can be confusing, for example, if the runtime and the test scopes have an equally short path, and test is chosen here, in the UI it will appear as if this was a test dependency.

Yeah and of course that's also the case if a longer path has a different scope, that scope will "disappear" as well.

But as a separate feature I would like to store all scope names that a package appears in, because this could also be used to filter packages by scope.

So I guess the scopes should be saved separately in their own table, and a separate field for scopes should be added in the response of the packages endpoint and this should be displayed in the details section of the package somehow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would just not mention the scope at all in this case, does it matter which scope contains the shortest path?

I'd also be ok with not displaying the scope at all in the UI, but as I cannot rule out use-cases for having it, I'd still include it in the backend API response.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still include it in the backend API response

But if we don't want to show it anywhere and it is randomly selected anyway, why even store it in the database?

I don't know what the use case for showing the shortest path is anyway, so I can't say if it makes sense to show the scope here. Getting a list of direct dependencies that introduce a certain transitive dependency, or respectively getting all paths from different roots, would appear more useful to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what the use case for showing the shortest path is anyway

It's supposed to give you a rough feeling of how much you're affected e.g. by a vulnerability in that dependency, and how complicated it is to swap it out. The idea is that anything "further away" than a direct dependency is harder to swap about, but also the likeliness to trigger vulnerable code might decline.

This whole feature is part of an epic to migrate all features of the WebApp report to the server UI. And as the WebApp shows dependency paths, we want to show them, too.

Getting a list of direct dependencies that introduce a certain transitive dependency, or respectively getting all paths from different roots, would appear more useful to me.

To me, that sounds more like what should eventually be show on the "Dependencies" tab, or be implemented as a "reverse search" feature to show all products that contain a certain dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, maybe we should think how we could make it better than the WebApp as part of the migration. But anyway, I don't have any issue with the implementation so I'll approve the PR.

To me, that sounds more like what should eventually be show on the "Dependencies" tab, or be implemented as a "reverse search" feature to show all products that contain a certain dependency.

I thought of this because in the Workbench packages can be filtered by scope.

Signed-off-by: Johanna Lamppu <[email protected]>
Add table to store information about the shortest dependency paths
for a package.

Signed-off-by: Johanna Lamppu <[email protected]>
Extract and save the shortest dependency paths for a package from the
shortest paths provided by the `DependencyNavigator`. For simplicity,
only save the shortest path across all scopes.

Signed-off-by: Johanna Lamppu <[email protected]>
Add information about the shortest dependency paths data to the
response of the packages endpoint. This will later be added to
be shown in the UI in order to get a sense of the importance
to address issues found in the packages.

Signed-off-by: Johanna Lamppu <[email protected]>
@lamppu lamppu force-pushed the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch from 94a38ae to c22c246 Compare February 19, 2025 13:13
@sschuberth sschuberth added this pull request to the merge queue Feb 19, 2025
Merged via the queue into eclipse-apoapsis:main with commit 4301a86 Feb 19, 2025
27 checks passed
@sschuberth sschuberth deleted the 1046-extend-the-packages-endpoint-with-the-shortest-dependency branch February 19, 2025 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants